Using WordNet-Based Context Vectors To Estimate The Semantic Relatedness Of Concepts

Authors

  • Siddharth Patwardhan
  • Ted Pedersen
Abstract

In this paper, we introduce a WordNet-based measure of semantic relatedness by combining the structure and content of WordNet with co-occurrence information derived from raw text. We use the co-occurrence information along with the WordNet definitions to build gloss vectors corresponding to each concept in WordNet. Numeric scores of relatedness are assigned to a pair of concepts by measuring the cosine of the angle between their respective gloss vectors. We show that this measure compares favorably to other measures with respect to human judgments of semantic relatedness, and that it performs well when used in a word sense disambiguation algorithm that relies on semantic relatedness. This measure is flexible in that it can make comparisons between any two concepts without regard to their part of speech. In addition, it can be adapted to different domains, since any plain text corpus can be used to derive the co-occurrence information.
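
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy corpus, the toy glosses, and the function names (`word_vectors`, `gloss_vector`, `cosine`, `relatedness`) are all assumptions made for illustration, and a real system would use actual WordNet glosses, a large corpus, and some form of stop-word and frequency filtering. Each word gets a co-occurrence vector from the corpus, a concept's gloss vector is taken here as the sum of the vectors of the words in its gloss, and relatedness is the cosine between two gloss vectors.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data standing in for a raw-text corpus and WordNet glosses.
CORPUS = [
    "the pencil writes on paper",
    "paper and pencil are used for writing",
    "the boat sails on the water",
    "a boat floats on water",
]
GLOSSES = {
    "pencil": "thin instrument used for writing on paper",
    "paper": "material used for writing",
    "boat": "small vessel for travel on water",
}


def word_vectors(sentences):
    """Map each word to counts of the words it co-occurs with in a sentence."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            for j, context in enumerate(tokens):
                if i != j:
                    vectors[word][context] += 1
    return vectors


def gloss_vector(gloss, vectors):
    """Sum the co-occurrence vectors of the gloss words seen in the corpus."""
    total = Counter()
    for word in gloss.split():
        if word in vectors:  # gloss words absent from the corpus are skipped
            total.update(vectors[word])
    return total


def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def relatedness(a, b, glosses=GLOSSES, corpus=CORPUS):
    """Relatedness of two concepts as the cosine of their gloss vectors."""
    vectors = word_vectors(corpus)
    return cosine(gloss_vector(glosses[a], vectors),
                  gloss_vector(glosses[b], vectors))
```

Because the vectors are built only from glosses and co-occurrence counts, `relatedness` can be called on any pair of keys in `GLOSSES` regardless of part of speech, and swapping `CORPUS` for text from another domain changes the scores without changing the code.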

Related resources

Incorporating Dictionary and Corpus Information into a Context Vector Measure of Semantic Relatedness

Humans are able to judge the relatedness of words (concepts) relatively easily, and are often in general agreement as to how related two words are. For example, few would disagree that “pencil” is more related to “paper” than it is to “boat”. Miller and Charles (1991) attribute this human perception of relatedness to the overlap of contextual representations of words in the human mind, and ther...

WordNet::Similarity - Measuring the Relatedness of Concepts

WordNet::Similarity is a freely available software package that makes it possible to measure the semantic similarity and relatedness between a pair of concepts (or synsets). It provides six measures of similarity, and three measures of relatedness, all of which are based on the lexical database WordNet. These measures are implemented as Perl modules which take as input two concepts, and return ...

Using People and WordNet to Measure Semantic Relatedness

This technical report describes in some detail (1) the creation of a dataset for testing the degree of relatedness between concepts out of the data from Beigman Klebanov and Shamir’s lexical cohesion experiment [3, 5, 6], and (2) a new measure of semantic relatedness based on WordNet. We welcome comments on this manuscript; however, please refrain from citing it, but rather the concise publishe...

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is mainly used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

An Approach to Improve the Representation of the User Model in the Web-Based Systems

A major shortcoming of content-based approaches exists in the representation of the user model. Content-based approaches often employ term vectors to represent each user’s interest. In doing so, they ignore the semantic relations between terms of the vector space model in which indexed terms are not orthogonal and often have semantic relatedness between one another. In this paper, we improve th...

Publication date: 2006